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APPROXIMATION OF BOUNDS ON MIXED LEVEL 
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Mixed level orthogonal arrays are basic structures in experimen- 
tal design. We develop three algorithms that compute Rao and Gil- 
bert- Varshamov type bounds for mixed level orthogonal arrays. The 
computational complexity of the terms involved in these bounds can 
grow fast as the parameters of the arrays increase and this justifies 
the construction of these algorithms. The first is a recursive algo- 
rithm that computes the bounds exactly, the second is based on an 
asymptotic analysis and the third is a simulation algorithm. They 
are all based on the representation of the combinatorial expressions 
that appear in the bounds as expectations involving a symmetric ran- 
dom walk. The Markov property of the underlying random walk gives 
the recursive formula to compute the expectations. A large deviation 
(LD) analysis of the expectations provide the asymptotic algorithm. 
The asymptotically optimal importance sampling (IS) of the same 
expectation provides the simulation algorithm. Both the LD analy- 
sis and the construction of the IS algorithm uses a representation of 
these problems as a sequence of stochastic optimal control problems 
converging to a limit calculus of variations problem. The construction 
of the IS algorithm uses a recently discovered method of using sub- 
solutions to the Hamilton Jacobi Bellman equations associated with 
the limit problem. 



1. Introduction. Mixed level orthogonal arrays (OAs, for short) are 
fundamental to experimental design. Each row of an array is thought of as a 
run of an experiment; each entry of the row is the value of a parameter of the 
system being tested. The goal of the experiment is to test as wide a range 
of parameter values of the system as possible. The number of parameters 
and which values these parameters can take (i.e., the row length and the 
alphabets where the row entries take their values) are determined by the 
characteristics of the system being tested. The remaining parameters of an 
OA are its number of rows N and its strength t. The strength of an OA is 
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t when the OA is capable of exploring all possible interactions of up to t 
number of parameters of the system (see Definition 2.1). N is the number of 
experiments that the OA describes. A high t and a low N is desirable. The 
Rao bound (see (1) below), first proved for fixed level orthogonal arrays by 
Rao [25], gives a lower bound on A'' in terms of t, the row length and the 
system parameters (i.e., the row length and alphabet sizes). Our first object 
of study is this bound and the goal is to develop algorithms that compute 
exactly and approximately the right side of this bound. 

The Rao bound is a necessary bound, all OAs satisfy it. There are also 
sufficient bounds that arise from constructions. One well known construction 
method for ordinary OAs is by taking the dual of error correcting codes 
[22]. [14] generalizes this idea by defining error-block codes, which are error- 
correcting codes in which one can specify what alphabet to be used for each 
entry of the code word. Furthermore, [14] notes that the duals of error block 
codes are mixed level orthogonal arrays. We use this idea and construction 
of error block codes in [23] to obtain construction of orthogonal arrays whose 
parameters satisfy a Gilbert- Varshamov (GV) type bound (see (2) below). 
Our second object of study is this bound. 

In subsection 2.2 we calculate the computational complexity of directly 
computing the Rao Bound (1) and the GV bound (2). We see that this 
complexity is polynomial in the strength parameter, and the degree of the 
polynomial is one more the number of different type of alphabets used in 
the OA. If many different types of alphabets are used in an OA, which is 
typical in real life experimental designs, the Rao and the GV bounds become 
inefficient to compute directly from their original representations (1) and 
(2). This potentially high complexity of the direct computation of these 
bounds justifies the construction of new algorithms to compute them. In 
the present paper we develop three algorithms for this purpose. The simple 
result that underlies these is an expectation representation of the Rao and 
the GV bounds that we derive in Sections 3 and 6. The expectation is that 
of a function of a random walk whose increments are either or 1 with 
equal probability. The walk takes n steps, the row length of the OA, and 
accumulates a cost throughout its excursion as follows: if the walk goes up at 
the i^'^ step, the accumulated cost increases by a factor one less the alphabet 
size of the factor of the OA. The aforementioned representation is the 
expectation of this accumulated cost over sample paths which are less than 
t/2 at the last step of the random walk for the Rao bound and less than 
t — 1 for the GV bound. 

Once these expectation representations are available, it is straightforward 
to use them in several ways to obtain algorithms to compute the bounds. 
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The Markov property of the underlying walk gives the recursive formula (5). 
The complexity of this formula is a second order polynomial in the strength 
parameter and is far less than the original formulas when the number of 
alphabets is large. 

The asymptotic behavior of bounds such as the Rao and the GV bounds 
is a basic question to ask. [23] carries out an asymptotic analysis of the 
GV bound for orthogonal arrays with two alphabets. To our knowledge, 
no results concerning the asympotic behavior of either the GV or the Rao 
bound for general mixed level orthogonal arrays is available in the current 
literature. With our expectation representation an asymptotical analysis of 
these bounds becomes what is called a large deviations analysis (LD) in 
probability theory and we use the methods of the LD theory to carry it out. 
In Section 4 we use the stochastic optimal control approach to LD [15, 17, 5] 
to show that the right side of the Rao bound (1) grows exponentially in the 
row length n and identify the growth rate. Following [5], we use a relative 
entropy representation of our expectation of interest to write it as a discrete 
time stochastic optimal control problem. Under proper scaling, this control 
problem converges to a limit deterministic calculus of variations problem. 
Similar to [27, 8], the connection between the prelimit and the limit problems 
is established using the Hamilton Jacobi Bellman (HJB) equation associated 
with the limit problem (see Section 4 for the Rao bound and in Section 6 for 
the GV bound) . This analysis provides our second approximation algorithm. 
To the authors's knowledge the idea of using the limit HJB equation to 
compute large deviation limits first appeared in [6] in the context of analysis 
of queuing systems. 

The asymptotic analysis gives good approximations in an exponential 
scale. More accurate approximations can be obtained using simulation, which 
is possible because we have the expectation representations (4) and (40). 
However, these are expectations over sets with small probability (i.e., rare) 
for reasonable values of the strength parameter t. For such expectations, 
ordinary simulation would require a great number of samples for reliable 
estimates. A remedy to this is importance sampling, which means to change 
the sampling distribution to a distribution under which the set over which 
expectation is taken is not rare anymore. One modifies the estimator ac- 
cordingly by multiplying it with a likelihood ratio to account for the change 
of the sampling distribution. IS is a well known idea, it goes back at least 
to 1949, see, for example, [21, 29, 18, 9], and the references therein. 

For our problem, an importance sampling distribution will be one under 
which with high probability our random walk remains below t—1 or t/2 at its 
final step. There are many such distributions. Among these, one would like 
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to choose a distribution that minimizes the variance of the IS estimator. It is 
weh known that an exact solution of this optimization problem is as difficult 
as directly computing the expectation [20]. In situations such as the one 
covered in this article where the object of study is a sequence of expectations 
decaying or growing exponentially in a parameter, a compromise is to choose 
a sequence of estimators whose variance decay or grow exponentially at a rate 
twice the asymptotic decay or growth rate of the expectation itself. Such a 
sequence is called asymptotically optimal, see [29] and [9] and the references 
therein. To obtain such a sequence we will follow [9, 8] and represent the 
variance minimization problems in IS once again as a sequence of stochastic 
optimal control problems. Under proper scaling, these also converge to the 
same limit control problem as the one that emerges in the large deviations 
analysis. Theorem 5.2 asserts that a simple change of measure based on a 
piecewise linear subsolution of the HJB equation of the limit control problem 
is asymptotically optimal. This idea of using subsolutions to construct IS 
algorithms is from [8, 27, 12, 11] and is called the subsolution approach to 
IS. 

The use of randomized algorithms for counting is one of the central ideas 
in statistics. The use of importance sampling for this purpose seems to be 
relatively new. [3] is the first article that we are aware of that uses impor- 
tance sampling for purposes of counting. More recent articles since this work 
include [4, 2, 1]. The current work seems to be the first to use the subsolution 
method to construct asymptotically optimal IS algorithms for counting. 

The plan of the paper is as follows. The next section gives the definition 
of an orthogonal array and states the Rao and the GV bounds. It computes 
the computational complexity of the original combinatorial representation 
of these bounds. Section 3 derives the expectation representation of the Rao 
bound and states the exact recursive algorithm to compute it (equation 
(5)). Section 4 carries out the large deviations analysis of the expectation 
representation of the Rao bound. The final result here is Theorem 4.2 with 
characterizes the growth rate of the bound as a finite dimensional concave 
maximization problem. The dimension of the problem is the number of al- 
phabets used in the OA. Section 5 uses the ideas in the above paragraphs 
to construct an asymptotically optimal IS algorithm to estimate the Rao 
bound, the final result is Theorem 5.2. Section 6 does for the GV bound 
what was done for the Rao bound in Sections 4 and 5. This generalization 
requires only minor modifications. Section 7 provides numerical results that 
gives evidence that the constructed algorithms are effective in practice as 
well. 
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2. Definitions and Bounds. We begin with tiie following definition 
from [22]. 

Definition 2.1. A matrix A is said to be an OA{N, s^ S2 ■■■s''J ,t) if it 
has the following structure: 

1. A has N rows, 

2. Row length of A is li + I2 + ■ ■ ■ + la, the first li components of each 
row are from the alphabet the second I2 components from TLg^,..., 
the last Ifj components from Z^^ . 

3. Take any t columns Ci^Ci^-.-Ci^ of A and call the matrix formed by these 
columns A' . Take any string s of length t such that j*^ letter of s comes 
from the alphabet corresponding to column ij. Count the number times 
s occurs as a row of A' . This count is the same for all s. 

The last item is the orthogonality property and t is the strength of the 
orthogonal array. This type of arrays are called mixed level because the 
columns are allowed to be from different alphabets (second property above). 

The parameters of any mixed level orthogonal array has to satisfy the 
Rao bound: 

i/2 - // \ 

(1) N>Y. I[(l"']i''^m-ir-. 

This bound corresponds to the sphere packing bound for error block codes. 
For o" = 1 (1) was proved in [25], for the proof of the general case see [22, 
page 201]. 

2.1. Sufficient bounds. The duality idea mentioned in the introduction 
and block error code constructions implied by Theorem 3.1 in [23] give mixed 
level orthogonal arrays whose parameters satisfy the following conditions: 
Si = f?"^* where g is a prime power, 

(2) iV. > g E _ \) is. - 1)-- n ) - D- > N. 

This is a sufficient bound; that is, it is known that OA's with these param- 
eters do exist. Bounds like (2) are called Gilbert- Varshamov type bounds in 
coding theory. 

The right side of (2) has essentially the same structure as that of (1). The 
key difference between these bounds is the upper limit of the outer sum: (1) 
goes up to t/2 whereas (2) goes up to t — 1. 
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In the next subsection we will study the computational complexity of 
directly evaluating (1) or (2). 

2.2. Computational complexity of evaluating (1) and {2).. It follows from 
their definitions that the evaluations of (1) and (2) have the same compu- 
tational complexity. Therefore, it is enough to consider one of them. 

The right side of (1) involves a partitioning of each i less than t/2 into a 

sum of a integers. The number of such partitions is ^\ Then the 

number of operations needed to compute the right side of (1) is bounded 
below by 



(3) 




i/2-l 



<T+1 



i=0 



where C is a constant that depends only on a. If the strength parameter t 
grows linearly in n, i.e., if t = /xn, where // € (0,1), a direct computation 
of (1) requires 0(n'^+^) operations. The present paper is aimed at finding 
methods to compute (1) and (2) more efficiently. 

The next section presents a simple probabilistic representation of (1), 
which forms the basis for all of the results and algorithms presented in this 
paper. 

3. Expectation Representation. Let Xi be independent and identi- 
cally distributed (iid) Bernoulli random variables with P{Xi = 1) = P{Xi = 
0) = 1/2. Let Sk = Xi + X2^ hXfc. Define the following "running cost:" 



r{x,j) 



if X 
if X 





1, and 



Efc=iZfc + i<i<Eki4. 



(1) can be written in the form 



(4) N > 2"E 



{5„<t/2} 



n 



^ hs.stmU'^rix.j) 



This is an expectation over the trajectories of Sk that stay below the level 
t/2 at step n. At each step the random walk accumulates a running cost 
r; the cost depends on the step number and the current step. The random 
walk can be thought of as a scan of the letters of a row of the array. At each 
step we flip a coin to decide whether the current letter will be included in 
the computation. If the decision is yes, i.e., li Xi = 1 and the random walk 
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goes up, then the current bound is muhiphed with 2(sj — 1) where Si is the 
alphabet size of the letter we are going over (this is the 2r term in (4)). 
The first sum in (1) group trajectories according to their positions at step 
n. For a position i < t/2, the second sum in (1) partition these i up-steps 
into different cost regions and the binomial coefficients count the number of 
possible ways Um up-steps can be taken in 1^ steps. 

A simple recursive algorithm to compute the Rao bound. Our first method 
to compute (1) is a recursive algorithm that computes the bound exactly. 
For integers < x < t/2 and < k < n define 



M{x,k) = E 



l{x'+s„_,<t/2} n 2r-(Xi,j) 



j=k+i 



The Rao bound (1) in terms of M is > M(0,0). Because Xi are iid and 
Si are their sum, M satisfies 



(5) 



M{x, k) = M{x + 1, + l)r(x, k) + M(x, k + 1), 



for X < t/2 and A; < n. In addition, we have the boundary conditions 
M(x,n) = for x < t/2 and M{t/2,k) = for k < n. These give an 
algorithm that takes only tn/2 steps to compute the Rao bound. If we write 
the strength parameter t as a fraction /x of n as t = /xn then the complexity 
analysis in the previous chapter implies that the direct evaluation of (1) 
will take at least 0(n°'+^) operations. Whereas the computation of the same 
bound using (5) will only take O(n^) operations. 

4. Large Deviations Analysis. The goal of this section is an asymp- 
totic analysis of the right side of (4) as n — > oo. In order for this analysis 
to be meaningful t and li need to grow with n as well. Therefore we assume 
that 



(6) t = fin, /X G (0, 1), k = nai, ^ = 1. 

The asymptotic analysis of (1) now consists of evaluating 



(7) 



lim — log E 

n n 



hs,.<t/2}X{2r{X,,j) 



For the evaluation of (7), we will follow [5] and begin by representing the 
logE[- • • ] term in it as a discrete time stochastic optimal control problem as 
follows. 
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Proposition 1. The following identity holds: 



n 



(8) 



logE l{c,„<,/2}n2K^j,j) 



n 



P{S„ < i^) = 1 



sup E Y.^ogr{X„j)-logp{X,\j,Sj.i) 



where the sup is over all transition probabilities p{-\-, ■) : Z2 x N x N — > [0, 1] 
that give the probability of the steps and 1 given the current position of 
and the current step number of the random walk S and P is the probability 
distribution defined by these measures on the path space of the random walk. 

The proof of this result is similar to that of Proposition 1.4.2 [5, page 31] 
and is omitted. The sup on the right side of (8) over all Markov chains on the 
sample paths of 5"^ such that the n^^ step is less than t/2 with probability 
1. The log term inside the sup corresponds to the entropy of •). 

Define Ai = J2)=i o-j and 



be the entropy function. As we observed earlier, the right side of (8) is a 
stochastic optimal control problem. Upon dividing it by n and scaling the 
time and space parameters with ^, and sending n to cx) one obtains the 
following limit deterministic optimal control problem: 



where the sup is over all measurable functions on [0, 1] with values in [0, 1] 
such that f^9{t)dt < /x/2. The rigorous connection between this optimal 
control problem and (8) can be established in several ways. For example, 
one can use the weak convergence approach of [5]. Another approach is 
via the HJB equation associated with the limit control problem (9) and a 
verification argument, which is followed in [8]. In this paper we will take 
this second path because the same method will also allow us to prove the 
asymptotic optimality of an IS estimator based on a subsolution of the limit 
HJB equation. 



= log(si - 1), Ai<t<Ai+i 



and let 



H{e) = -eiogo - (1-9) iog(i - 6) 



(9) 
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4.1. Solution to the limit control problem. For Ai < t < Aj+i, L{t,6) = 
log(sj — 1)9 + H{9) is a strictly concave function with no t dependence. 
Then Jensen's inequahty imphes that the optimal trajectory needs to be a 
straight line between times Ai and ^j+i. Therefore, it is enough to consider 
the optimization problem (9) over piecewise linear continuous paths and the 
sup in (9) equals 

(10) sup ai {6, log{si - 1) + H{e.i))^ , 
where the sup is subject to 

(11) ^^G(0,1), {a,e)=fi/2. 

The objective function of this finite dimensional constrained optimization 
problem is strictly concave and its constraints linear. Therefore, a straight- 
forward use of a Lagrange multiplier converts the problem to a one of root 
finding of a one dimensional monotone function. 

In the next subsection we will prove that a function defined based on (10) 
satisfies an HJB equation. We will use this fact to prove the convergence of 
(7) to (10). 

4.2. The limit Hamilton Jacobi Bellman equation. Let us generalize the 
problem in (9) so that the problem starts from any initial point x < at 
any time t € [0, 1]: 

(12) V{x, t) = sup f\f{s)9{s) + H{eis))]ds, 

e Jt 

where the sup is over all measurable 6{-) > such that x + // 9{s)ds < /i/2. 
The sup in (9) equals y(0, 0). Generalizing (10), for Ai<t < ^j+i we have 
that 

(13) 
V{x,t) 

= sup J (A,+i - t){e, iog(s, - 1) + H{e,) + iog(^j - 1) + ^(^j)) \ ' 

where the sup is subject to 

(14) 9j G (0, 1), X + 9^{Ai+i -t)+ Y "i^i ^ 

j=i+i 
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Let us now write V more explicitly. Firstly, the absolute maximizer of (13) 
without the constraints (14) is 



(15) 

Ue* satisfy (14), i.e., if 
(16) 



Sj - 1 



j=i+l 



then V equals 

V{x,t) = {t-Ai) 



+ E 

j=i+i 



Si 



\og{s^-l)+H{{si-l)/si) 
1 



log{s,-l)+H{{s,-l)/s,) 



If the absolute maximizers (15) do not satisfy (14) then one can use a La- 
grange multiplier A to solve (13): 



Then 
(17) 



log{sj - 1) + log 



A, j > i. 



Sj - 1 



1 



For these to give a solution to (13) they must satisfy (14): 

, -1 ^ Si - 1 



(18) 



{A 



i+l 



+ Si 



J2 «i 



1 ■ ^e^ + Sj 



1 



— X. 



For A = 0, the left side is by assumption greater than /z/2 — a; and for 
A = oo it is 0. Because it is monotone in A, there exists a unique A*(t, x) 
for which (18) is satisfied. By the implicit function theorem, A*(t, x) is twice 
differentiable in both t and x with bounded derivatives for t ^ Aj. And for 
t = Aj, A has right derivatives in t and an ordinary derivative in x. Because 
the function that is optimized in (13) is strictly concave, the stationary point 
given by A* is actually a global maximizer. 
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Define 

Si - 1 



+ E «i 

j=i+i 



log(si -l)+H 



In ligfit of the above computations, V{x,t) of (13) can be written more 
exphcitly as 



V{x,t) 



'V{t,0), if (16) holds, 

V{t, X*{x,t)), otherwise. 



One obtains the following proposition by ordinary calculus and implicit 
differentiation. 

Proposition 2. V is smooth except for t = Ai where it has direc- 
tional derivative Vt{x^t) which is defined as Vt{x,t) = limh\Q{V {x , t + h) — 
V{x,t))/h. Higher order t partial derivatives similarly exists. In particular 
for any t and x we have: 

V{x + 5,t + h) = V{x,t) + 5T4(x,t) + hVt{x,t) + c{x,t){5^ + h^) 

where sup^. j |c(x,t)| = C < oo. 

Now we state the HJB equation satisfied by V . 

Theorem 4.1. The following dynamic programming equation holds: 
(19) = sup {r{t)e + H{e) + y^(a;, t)e + Vt{x, t)} 

6»g[0,l] 

for {x,t) € [0,1^/2] X [0,1). 

Proof. Take {x,t) G [0,fi/2) x [0,1), a smah 5 > and (9 € [0,1]. (12) 
implies 

rt+S 

V{x,t)> / r{s)9 + H{9)ds + V{x + 96,t + 6) 

V{x, t) - V{x + e5,t + 6)> [logisi - 1) + H{9)]6. 

Because Vt and Vx exist, dividing both sides of the last display by 5 and 
letting 5 — > gives: 

-Vt-9Vx>logisi-l) + H{9). 
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Because this is true for all 9 € [0, 1] we have: 



> sup {f{t)e + H{e) + t)e + Vt{x, t)} 

fe[o,i] 

One replaces > with = by taking 9 to be the optimal control ^*(A*(x, t)). □ 

4.3. Convergence Analysis. In this subsection we formally connect the 
sequence of stochastic optimal control problems in (8) to the limit control 
problem (9) and its solution developed in the previous subsection. 

Figure 1 gives the level curves of V{x,t) and V6o(L"'2;Ji L'^^J) where 



Vn{x,i) = -logE 



n 



for ai = 02 = 03 = 1/3, si = 2, S2 = 30, S3 = 100 and ^ = 0.1. This 
figure suggests that Vn{nx,nt) — s- V{x,t) for all values of Our main 

convergence theorem, which we state and prove next, concerns the special 
case when = (0,0). 

Theorem 4.2. The large deviations limit in (7) equals y(0,0), i.e., 
(20) 



hm - log 2"E 

n n 



i=i 



sup<^^ai {edog{si-l)+H{ei))\, 



where the sup is over 
(21) 



E(0,1), (a,0)=/i/2. 



Proof. The proof will be a verification argument using V and the HJB 
equation (19). By Proposition 1 there exists p"(-|-, •) such that 



logE 



hSr.<t/2}\{2r{X,,j) 



E 



^r(X„i)-logp"(X,|j,5,_i) 



+ e{n) 



where e{n) and E is expectation with respect to p"(-|-, ■ 



V{0,0) =E[V{0,0) -V{Sn/n,l)] 



n-l 



E[V{Sj/n,j/n) - V{Sj+i/n, (j + l)/n) 

j=0 
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Fig 1. The level curves ofV{x,t) and Veo 



By Proposition 2 this equals 

C{n 



n 



V^{S,/n,j/n)Xj/n - Vt{S,+i/n, {j + l)/n) 



n 



where sup„ |C(n)| < oo. One can condition the last expectation on Sj to 
rewrite it as 

= £M + i ^ E [-V;(S,-/n,i/n)p"(l|5,/n,i/n) - VtiS,+,/n, (j + l)/n)] 



n n 



Now by Theorem 4.1 this last sum is greater than: 
C(n 



> 



n 



+ [f(j7n)r j'/n) - H {p"- {l\S ^ , j) 



which in turn equals 

r 1 " - 

= - + -^E[r(j, X,) - log(p"(X,|S,-,j))] 
Letting n go to infinity yields 



y(0,0) > limsup-E" 
n 



5]r(j,X,)-logp"(X,|5„j) 



> lim sup — log E 

n 



e. 
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For the reverse inequality we first note that the result of the optimization 
in (20) is continuous in the strength parameter fi which appears in the 
constraint (21). Let 9* be the optimizers of (20) when the in (21) is 
replaced with /i — 4e where e > is a small constant. Let 



(22) 



p*{l\x,j) 



if Ai < j/n < Ai+i 



and P* be the measure on the path space of (5, X) corresponding to p* . We 
would like to use P* on the right side of (8) to get a lower bound on its left 
side. Once this is done the law of large numbers would give us the bound 
we desire. The only problem is P*{Sn < /un/2) < 1 so -P* is not included 
in the set of measures over which the right side of (8) is optimized. This is 
a minor technical problem and can be handled as follows. By definition Xj 
is iid for Ai < j/n < ^j+i. Therefore the ordinary law of large numbers is 
applicable and gives: 



P*iSn/n> p/2-e) 



0. 



Then, the fact that P*{Sn < fin/2) 7^ 1 is not a major problem and can be 
dealt with by simply conditioning it on {Sn < pn/2}. 

The details of this argument is as follows. Let p* = P*{Sn < /xn/2) and 
define 



(23) 



1 



l{5„<AJn2}-P*- 



Under P*'^, {Xn , Sn) is a Markov chain whose transition probability is 



(24) 



p*'%l\s,j) =p*{l\s,j[ 



s + 1) 



P*{Sn<l^n/2\Sn-j-i = s) 
P*''^iSn < fin/2) = 1 and therefore by Proposition 1 we have: 



- log 2"E 
n 



n 



{S„<t/2}Y[r{Xj,j) 

n 

^\og r{Xj,j) 



logp*''iXj\j,X,_i) 



By (24) and (23) this equals 
1 1 



-E* 



Pn n 



1{5„<W2} J2 ^ogr{Xj,j) - logp*{Xj\j,Xj^i) 



+ logpl 



imsart-aos ver. 2009/02/27 file: oa-is.tex date: May 3, 2009 



APPROXIMATION OF BOUNDS ON ORTHOGONAL ARRAYS 



15 



9* and r are all positive and bounded. Therefore there exists a positive C 
such that 



C{l-P*n) 
Pn 



1 1 - 

> — -E* Yl log K^j- , j) - log r {Xj \j, Xj^i) + log p: 

Pn ^ j—i 
= log(«^ - 1) + ^(^D) + logK - 

By the law of large numbers — > 1. This, the last sequence of inequalities 
and the definition of li give: 



liminf ilog 2"E 

n 



MSr.<t/2}Y[r{Xj,j) 



> J2i^i9* iog{si - 1) + //(^n) = y(0, 0) - 5 

i=l 

where 5 is a small number that goes to with e. This inequality concludes 
the proof of this theorem. □ 

5. Importance Sampling. The expectation representation (4) of the 
Rao bound brings to mind the possibility of estimating it using simulation. 
Because the strength parameter t is usuahy a fraction of n and because the 
aforementioned expectation is over the set {Sn < for large values of 

n a direct simulation would require too many samples of Sn to converge. 
One can instead use importance sampling, which means to sample from a 
new simulation measure under which {Sn < t/2} is not rare. The samples 
are scaled by the Radon Nikodym derivative of the original measure with 
respect to the new sampling measure so that the simulation algorithm still 
estimates the probability under the original measure. The main problem in 
IS is the choice of the new sampling distribution. One tries to choose it so 
that it is practical to sample from it and that it nearly minimizes estimator 
variance. In the next subsection we briefly introduce the main ideas of IS in 
a general setting before we focus on its use in our current setup. 

IS is a well known method for estimating small probabilities, a very partial 
list of articles and books on the subject are [20, 21, 29, 24, 18, 9, 8, 27]. These 
works contain many more references to important works on the subject. 

5.1. IS Review. Take a probability space {Q,J-,P) and a measurable 
integrable function / : $7 — > M. Suppose P is a probability measure on (fi, J^) 
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with respect to which P is absolutely continuous. We have the following basic 
identity: 



(25) E[/] 



f{io)dP{co) 



dP 

f{u)-^{u)dP{u) 
n dP 



E 



dP 
dP 



where ^ is the Radon Nikodym derivative of P with respect to P and E [E] 

is expectation with respect to P [P]. The identity (25) suggests the following 
simulation algorithm to compute E[/]. Simulate iid copies wi, a;2,. • • , con of 
Lo from P and use the following to estimate E[/]: 



N 



SN 



N 



i=l 



dP 

f{i) = f{uji)—^{ui). 
dP 



By the law of large numbers s m 



E 



fiE. 

•' dP 



which by (25) equals E[/]. Fur- 



thermore by the linearity of the expectation and (25) one also has E[sjv] = 
^[/(l)] = ^[f]- Therefore sat is an unbiased estimator of E[/] that con- 
verges to this values as ^ cx). This method of estimating E[/] is called 
importance sampling (IS). 

IS is especially useful when P{{f 7^ 0}) is small. In such a case, ordinary 
Monte Carlo will require a large number of samples uJi for a reliable estimate 
of E[/]. One could choose P so that P{{f 7^ 0}) is no longer small and hope 
to reduce the number of samples required for a good estimate. A simple 
way to choose P so that this happens is to minimize the variance of the 
IS estimator s^r. Because sat is unbiased its variance depends on P only 
through its second moment which equals E[/^(l)]/A^. Here is the number 
of samples used in the estimation and is taken to be a constant. Therefore, 
for a good IS estimator one tries to solve the following optimization problem: 



(26) infE[/2(l)] = infE 
p p 



dP\' 
dp) 



infE 
p 



.dP_ 
dP. 



where the inf is over all P with respect to which l^f^QjdP is absolutely 
continuous. The exact solution to this problem turns out to have an easy 
description. It is simple to prove that dP* = -^jjdP is actually the minimizer 



of (26) and hence we have that 



E 



dP 



dP* 



inn? 



Then (E[/])^/A^ is the smallest possible second moment for an IS estimator 
which uses samples and the estimator defined by P* has zero variance. 
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5.1.1. Asymptotic Analysis. As is well known in the IS literature, P* is 
not a practical simulation measure because knowing it requires knowing E[/] 
which is the very quantity that is not known and whose estimation is sought. 
Therefore, one usually seeks an almost minimizer of (26) to conduct a good 
IS simulation. If there is a sequence fn whose expectation is sought, one way 
to find almost minimizers to (26) is to conduct an asymptotic analysis of 
the sequence of optimization problems given by (26) and the sequence fn- 
If these converge in some sense to a relatively simple limit problem then 
the optimizers of this limit problem can inform the construction of almost 
minimizers to (26) for the estimation of E[/„]. 

One setup where such an asymptotic analysis is possible is when the 
underlying measure P is that of a Markov process and 

(27) limilogE[/„] = 7 

exists and is nonzero. As the reader have already seen in the previous section, 
the problem in this article falls into this category. 

When the limit (27) exists, one can define an asymptotic optimality con- 
dition for a sequence of IS changes of measure as follows. Jensen's inequality 
and the unbiasedness of /(I) implies 

1 ~ 2 
liminf-logE[/^(l)l > liminf - logE[/(l)l =2j. 

n n ri n 

In other words, the exponential growth rate of the second moment of any 
sequence of IS samples is at least twice that of E[/„]. A sequence of IS 
estimators is said to be asymptotically optimal if the lower bound is achieved, 
i.e., if 

(28) limsup - logt[f(lf] = limsup - logE 

n n n n 

5.2. The IS problem for the Rao Bound. In the context of estimating the 
expectation representation (4) of the Rao bound using IS, /„ in (27) is 

n 

fn = l{S,,<t/2}Y[^r{Xj,j) 

where Sj is the symmetric random walk with increments Xj defined earlier. 
The reason IS is necessary is because of the i{s„<t/2} term. If we take t = fj,n 
with n < 1, as n goes to oo the probability of 5„ being less than t/2 goes to 
exponentially. In order to simulate X and S using importance sampling 
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one specifies a sampling distribution p{v\i,s), v € {0,1} and s S Zj and 
simulates X from this distribution as follows. One sets = 0. At step i 
of the simulation a random increment Xi is sampled from the distribution 
p{-\i, Si) and sets Sj+i = Xi + Si. Note that the distribution of the increment 
Xi is allowed to depend on the current position of the random walk S. Let 
P denote the probability measure on the sample paths of Sn defined by 
the transition probability p{-\i,s). Then the Radon Nikodym derivative ~ 



equals YVi 



0.5 

=1 p{x^\j,s^) 



Then, the IS estimator of E[/n] using K sample paths 



is 



(29) 



K 



JlY.fn{k), fn{k) = l 



n 



k=l 



\p{X^\j,S^) 



where denotes the k^^ independent sample path used in the simulation. 
The increments {-^^'^j are iid copies of the increment process X sampled from 
p. Then, by Theorem 4.2 the optimality condition (28) for the IS estimator 
(29) is 



(30) hm sup — log E 

n 



{Sn<t/2} 



n 



\p{Xj\j,S^) 



< 2y(o,o). 



5.3. The limit optimization problem. In the next subsection we will show 
that a sampling distribution p*(-|-, •) based on the large deviations analysis 
of the previous section satisfies (30), i.e., is asymptotically optimal. It turns 
out that for the proof we won't need a complete asymptotic analysis of 
the IS optimization problem (26). However, we include the following formal 
derivation of the limit optimization problem because it elucidates the direct 
connection between IS and large deviations analysis. As the reader will see, 
this connection is very general and not limited to the current problem and 
has been known at least heuristically for a long time, see for example [24] in 
the context of queuing networks. A more rigorous and clear connection has 
been established recently in [9, 10, 8, 27, 26]. 

Now we proceed with our formal derivation. For the present case, the IS 
optimization problem (26) becomes 



inf log E 

p 



{Sn<t/2} 



2r{Xj,j 



J}^p{X,\j,Sj 
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This equals 



inf sup E 

P p:P(5„<t„/2) 



^ 2 log r(X, , j) - \ogp{Xj li, Sj) - \ogp{Xj \j, Sj) 



by a direct generalization of Proposition 1 to the present case. It can be 
shown that this expression is convex in p and concave in p and therefore 
the order of the inf and sup can be switched without effecting the result. 
Once this is done the optimization in p gives the optimizer p* = p and the 
problem reduces to 



sup E 

P:P{Sn<tu/2) 



J22logr{X„j) -2logpiXj\j,S,] 



and this is the same problem as in the representation (8) except for a factor 
of 2. We know from the analysis of the previous section that when scaled by 
n this problem converges to 



sup / [2f{t)e{s) + 2H{e{s))]ds, 
e(-) Jo 

where the sup is over measurable ^ > such that Jq 6{s)ds < fi/2. This 
is the same as (9), again except for a factor of 2. Finally, and as before, 
because 2f{t)6 + 2H{6) is concave and independent of t ioi Ai < t < A^+i 
the last problem reduces to 



(31) 



sup <^ «i (^^2 \og{si - 1) + 2H{e,)) 



where the sup subject to 

0, G(0,1), {a,e)=ii/2. 

Therefore the limit optimization problems for the large deviations analysis 
and importance sampling are the same modulo a factor of 2. In particular, 
the minimizers 6* of (10) are also the minimizers of (31). 

5.4. An asymptotically optimal IS sampling measure based on LD anal- 
ysis. There are many asymptotically optimal IS sampling measures to es- 
timate (4). For example, one is p* of (22). The problem with this change 
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of measure is that it requires the solution of (18) at every step of the ran- 
dom walk Sn- For large n this is inefficient. A much preferable situation is 
a fixed change of measure, i.e., a change of measure p that doesn't depend 
on t and x. In the estimation of the Rao bound, we expect such a change 
of measure to exist for two reasons 1) the underlying process is iid and one 
dimensional 2) the probability of interest concerns exit from a region with 
only one boundary point. For more on these points we refer the reader to 
[9, 27] and [29, 24, 19]. Let us now construct a fixed change of measure for 
our problem. 

Let 6* to be the unique minimizers of (10) and define 



p* is almost fixed in the sense that it only depends on the step number j 
and not on the position x of the random walk Sn- The dependence on j is 
very intuitive and simple: each block of the orthogonal array has its own 
fixed jump probability 6*, for the steps corresponding to the i*^ block one 
uses this fixed probability to sample the increments of Sn- 

Before we state and prove our theorem which asserts that an IS estima- 
tion based on (32) is asymptotically optimal, we would like to make some 
comments and setup several things that we will need in the proof. Let us 
begin with the computation of (32). One simply uses (17) and (14) with 
t = and X = 0. Then 

where A* is the unique solution of 



Therefore, one can compute the IS change of measure p* of (32) by simply 
solving the simple one dimensional problem (34) to identify A* before the 
simulation begins. Throughout the simulation no further computation will 
be necessary to calculate p* . This is a great advantage over an IS simulation 
based on (22) which would require the solution of (34) at every step of the 
simulated random walk Sj. 

Subsolutions. A function V that satisfies 



(32) 





(34) 



sup {f{t)6 + H{9) + T4(x, t)9 + Vt{x, t)} > 



fe[o,i] 
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is called a subsolution to the PDE (19). In the next paragraph we will con- 
struct a subsolution to (19) and the proof of asymptotic optimality will be a 
control theoretic verification argument based on this subsolution. This tech- 
nique is from the "subsolution approach" to IS which was first developed in 
the context of queuing networks in [27, 8]. For a more general development 
see [12, 11]. The paper that precedes these articles and which introduced 
many of the ideas that underlie the subsolution approach is [9]. Other arti- 
cles using the approach include [26, 28, 7]. 

Usually, the subsolution approach is very useful for constructing good IS 
algorithms. This is the case in most of the aforementioned references. In 
the present case, we already have a simple algorithm and we will use the 
approach to prove that our algorithm is optimal. For the subsolution, let us 
cah it W, we set Wx{x,t) = — 2A* for ah {x,t) and choose Wt so that W 
solves (19): 

(35) Wt{t,x) = 2X*e* - 21og(s, - 1)6* - 2H{e*), A,<t< A,+i. 

These define W up to an additive constant. This is sufficient for our needs 
since only the increments and partial derivatives of W appear in a verifica- 
tion argument. By its construction W is piecewise affine, continuous and in 
fact a solution (and hence a subsolution) to (19). 

Remark 1. W \s a. solution to (19) and, as we have already noted in 
Theorem 4.1, so is V defined in (13). Evidently W . This is a common 
situation in optimal control, that is, an HJB equation may have many so- 
lutions. What makes V unique is that it is the maximal solution to (19). 
For more on these issues and a great deal of more information on stochastic 
optimal control we refer the reader to [16]. 

Besides being a solution to (19) here are two properties of W that play a 
key role in the optimality proof. 

Lemma 5.1. W^<^ and Ty(/i/2,l) - 1^(0,0) = 21^(0,0). 

Proof. Let g{X) denote the left side of (34). g \s a. decreasing function of 
A, with limA^oofi'(A) = and \\m.\^^ao g{\) = 1. (sj — l)/sj > 0.5, because 
each Si is an integer greater than 1. Then g[{)) > 0.5 > n/2. It follows that 
A* > 0. By definition Wx = — 2A* < and this is the first part of this lemma. 

By their definition 6^ satisfy J2i=i ^i^^i = ^Z^, see (33) and (34). Let xj = 
J2i=i^iO'i- One can write W{fi/2,l) — W{0,0) as the following telescoping 
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sum: 

W{^^/2^)-WM 

<J 

i=l 

By definition W is affine for t e with partial derivatives Wx = 

—A* and Wt given in (35), therefore this last sum equals 

= H Wxixi_i,Ai_i){xi - Xi_i) + Wt{xi_i,Ai_i){Ai - Ai_i) 

i=l 
a 

= '^^*0^ai - 2X*9*ai - 21og(s, - 1)6* - H{et) 

i=l 

cr 

= -Y,2\og{.Si-i)e* + 2H{et) 

i=l 

By definition 0* are the unique optimizers of (10), therefore 

= -2 sup |e log(s* - + H{e,)^ 

where the sup is subject to (11). This last quantity by definition equals 
—V{0, 0). This concludes the proof of the second part of this lemma. 

□ 

It follows directly from the definitions of Wt and 9* that 

(36) Wt + log (^e'^Hsi - 1)'^ + = 0- 

Let Xi be a Bernoulli random variable with P{Xi = 1) = 0.5. For integers 
X and Ai_in < j < AiU, one can represent the previous display probabilis- 
tically as 



(37) E 



p*{Xi\x,j) 



1 



Remark 2. The way it is presented above, (36) seems unmotivated. One 
should think of it as a multiplicative representation of (19). One can derive 
(36) directly from (19) first representing the optimization problem in that 
display as a trivial game and then using a representation result similar to 
(8). For a similar argument, see [27, Lemma 2.5.2]. 
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Theorem 5.2. The IS estimator based on p* of (32) is asymptotically 
optimal. 

The following proof follows the same steps as the optimality proof given 
in [8]. It is simpler because there is a fixed time horizon n so no truncation 
of time is needed. 

Proof. To ease notation let AWi denote W{{Sj+i, {j+l))/n)-W{{Sj,j)/n 
Define 

k-l n 



n 



r{X„i) 



,=1 P*{X^\^,S^) 

It follows from (37) that is a martingale and that 



E 



n-1 

{Sn<tn/2} n ^ 

i=l 



nAWi 



r{Xi,i) 



p*{Xi\i,Si 



We saw in Lemma 5.1 that Wx < 0, therefore 

n-1 

nAW^ = n{W{Sn/n, 1) - W{0, 0)) > n{W{n/2, 1) - W{0, 0)) = -2nV{0, 0) 



on {Sn < /in/2}. The last two displays imply 

n-1 



^2nVm > ^ 



{S„<in/2} 



1=1 



p*{Xi\i,Si) 



Taking the log of both sides, dividing by n and letting n go to oo proves that 
(30) holds for the change of measure p*{-\-, ■)• i-e., the IS change algorithm 
defined by this change of measure is asymptotically optimal, which is what 
we wanted to prove. □ 

6. The Gilbert- Varshamov Bound. The results derived for the Rao 
bound (1) in sections 3, 4 and 5 can be derived for the Gilbert- Varshamov 
bound (2). The analysis and the results are essentially the same, the main 
difference is that fi replaces /i/2 in (21) and other similar places. 

The key quantity in (2) is 



t-i 



(38) E E 



i=0 "ij^2,. 



\u„-l 



a-1 

n 

m=l 



Im 



[Sn 
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Let Sn, Xi and r be defined as in Section 3. The expectation representation 
of (38) is 



(39) 



n-l 



{Sn-l<t~l} 



X{2r{Xi,i) 



i=l 



This is exactly the same as (4), except for the fohowing differences. 

1. The expectation is over a random walk that takes n — 1 steps, rather 
than n, 

2. There is a Sa factor in front, 

3. The expectation is over those trajectories such that Sn-i <t — l rather 
than Sn<t/2. 

As was the case in Section 4 the asymptotic analysis of (40) will involve a 
^ log scaling. Under this scaling the asymptotics of (40) is the same as that 
of 



(40) 



E 



i=l 



Let /j, Oj, tj, IX be as in (6). Theorem 4.2 implies 
(41) 



lim — log E 

n n 



hSn<tr.}IlMX„j) 



sup<^^ai {9ilogisi-l)+H{ei)) 



where the sup is over 

(42) ^^£(0,1), (a,^)</i. 

If /i € (0.5, 1) then {Sn < l^n} is not a rare event and there is no need for 
IS to simulate (40) effectively, one can use straight forward Monte Carlo for 
this purpose. Otherwise, Theorem 5.2 implies that the minimizers of (41) 
define an asymptotically optimal IS change of measure to estimate (40). 

7. Numerical Results. We used the Octave numerical computation 
environment [13] for the numerical computations in this section. 



7.1. The Rao Bound. 
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Example 1.. Consider the following parameter values for an orthogonal 
array: a = 4, alphabet sizes s = [ 13 10 7 5], the block lengths / = [20 20 
20 20] and t = 4. Then n = 80 the scaled strength parameter ji = 0.05, and 
length parameters a = [0.25 0.25 0.25 0.25]. 

For this example, the exact Rao bound can be computed in two ways: 
either using the original formula (1) or the recursive algorithm (5). Both of 
these algorithms very quickly yield the value 190051. 

We solve (10) with the above parameter values to get the large deviation 
decay rate 1/(0,0) = 0.1681. Then the large deviation estimate of the Rao 
bound is e^^^'*^^" = e^^'^^ = 689760 which is about three times larger than 
the actual bound found above. This type of inaccuracy is expected since an 
LD analysis only identifies the exponential growth rate. 

We know from Section 5 that if the optimizers of (10) are used as an 
IS change of measure in (29) the resulting IS algorithm is asymptotically 
optimal. The optimizers of (10) for the above value of parameter values is 
e* = (0.0383 0.0290 0.0195 0.0131). Below are five estimation results using 
this algorithm with K = 2000 sample paths. The standard error column 
presents the estimated standard deviation a{sK)- The informal 95% confi- 
dence intervals are [sk — 2a{sK) sk + 2(T(si<-)]. 





Estimate sk 


Standard Error 


95 % CI 


Scaling 


Est. 


1 


1.94 


0.06 


[1.82 2.06] 




Est. 


2 


1.82 


0.06 


[1.70 1.94] 




Est. 


3 


1.83 


0.06 


[1.71 1.95] 


xlO^ 


Est. 


4 


1.82 


0.06 


[1.70 1.94] 




Est. 


5 


1.92 


0.06 


[1.80 2.04] 





The results in the table suggest that the asymptotically optimal IS scheme 
derived in Section 5 also perform well in practice. All of the estimates are 
close to the actual value, the formal confidence intervals are tight and they 
all happen to contain the exact Rao bound. 

Example 2. Now consider a = 40, alphabet sizes Si = 20 + i, block lengths 
li = 20, i = 1,2, ...,40 and strength parameter t = 20. Then n = 800, 
= 0.025, and = 0.025. 

For this example, the complexity analysis (3) in Section 2 indicate that 
the direct computation of (1) would require about 10^^ operations, which 
of course is not possible to perform in any reasonable amount of time. The 
recursive algorithm (5) yields 2.57 x 10^^ in a second or less. Obviously this 
is an impractically large value and it is clear that it is impossible to build 
an orthogonal array for the parameters listed above. 

The large deviation decay rate V{0,0) turns out to be 0.113 for this 
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problem and the corresponding large deviation estimate of (1) is e^^^''^)" = 
g90.4 _ -[^ g2 X 10'^'^, which is, at the scale of 10^^, close to the exact value. 

The optimizers of (10) is a forty dimensional vector and is inconvenient 
to list explicitly. The IS estimate based on (1) using these optimizers and 
K = 1000 samples are as follows: 





Estimate sk 


Standard Error 


95 % CI 


Scaling 


Est 


2.49 


0.14 


[2.21 2.77] 




Est 


2.58 


0.14 


[2.30 2.86] 




Est 


2.43 


0.14 


[2.15 2.71] 


xlO^^ 


Est 


2.35 


0.14 


[2.07 2.63] 




Est 


2.55 


0.14 


[2.27 2.83] 





As in the first example, practical performance of the IS estimator is very 
good here. All the estimates are close to the exact value, the confidence 
intervals are tight and they all happen to contain the exact value. The run 
time for each estimation is around a second. 

7.2. The Gilbert Varshamov Bound. Let us continue with the previous 
parameter values. The computation for this bound is the same as Rao bound. 
In the example below, we calculate the expectation (40) rather than the 
actual quantity (38), which is a multiple of the expectation. We can use our 
recursive algorithm (5) to compute the exact GV bound (2) to be 3.13 x lO''^. 
The large deviation growth rate V{0, 0) = 0.2088 and the large deviation 
estimate of the GV bound is e^(°'°)^°° = 2.85 x 10^^ The IS results are: 





Estimate sk 


Standard Error 


95 % CI 


Scaling 


Est 1. 


3.5288 


0.23467 


[3.0595 3.9981] 




Est 2. 


3.4698 


0.23083 


[3.0082 3.9315] 




Est 3. 


3.4154 


0.2258 


[2.9638 3.867] 


xlO^^ 


Est 4. 


3.2821 


0.21933 


[2.8434 3.7207] 




Est 5. 


2.8326 


0.19576 


[2.4411 3.2242] 





Once again they are accurate and reliable. 



7.3. A comparison. A comparison of the asymptotic versions of the Rao 
and the GV bounds is given Figure 2. Take g = 2, si = 2, S2 = 4, S3 = 8 
and S4 = 16, a, = 0.25. The following graph depicts the Rao and the GV 
asymptotic bounds for /i € [0,1]. The large gap between them is due to 
the difference of a factor of 2 between the constraints of the Rao and the 
GV bounds. The GV bound is flat for larger values of /i. This is because for 
these values of fi the unique global maximizer of (41) satisfies the constraint 
(42). 
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Fig 2. A comparison of the Rao and GV bounds 
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